Fitting class-based language models into weighted finite-state transducer framework

Authors

  • Pavel Ircing
  • Josef Psutka
Abstract

In our paper we propose a general way of incorporating class-based language models with many-to-many word-to-class mapping into the finite-state transducer (FST) framework. Since class-based models alone usually do not improve recognition accuracy, we also present a method for efficient language model combination. An example of a word-to-class mapping based on morphological tags is also given. Several word-based and tag-based language models are tested on the task of transcribing Czech broadcast news. Results show that class-based models help to achieve a moderate improvement in recognition accuracy.
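To make the many-to-many word-to-class mapping concrete, the following is a minimal sketch of how a class-based bigram model can score a word sequence when one word may belong to several classes. It is not the construction from the paper, which is expressed through transducer operations; plain Python dictionaries stand in for the mapping and class-model transducers here. All class labels, words, probabilities, and function names are hypothetical toy values chosen for illustration.

```python
from collections import defaultdict

# Hypothetical toy parameters (assumptions, not taken from the paper).
# EMISSION[c][w] = P(w | c); "walk" belongs to two classes (many-to-many).
EMISSION = {
    "D": {"the": 1.0},
    "N": {"dog": 0.5, "walk": 0.2, "park": 0.3},
    "V": {"walk": 0.6, "runs": 0.4},
}
# TRANSITION[c_prev][c] = P(c | c_prev); "<s>" marks the sentence start.
TRANSITION = {
    "<s>": {"D": 0.7, "N": 0.2, "V": 0.1},
    "D": {"N": 0.9, "V": 0.1},
    "N": {"V": 0.6, "N": 0.3, "D": 0.1},
    "V": {"D": 0.6, "N": 0.3, "V": 0.1},
}


def sequence_probability(words):
    """Sum the probability of `words` over every compatible class sequence."""
    forward = {"<s>": 1.0}  # probability mass per most recent class
    for word in words:
        step = defaultdict(float)
        for prev_cls, prev_p in forward.items():
            for cls, trans_p in TRANSITION[prev_cls].items():
                emit_p = EMISSION[cls].get(word, 0.0)
                if emit_p > 0.0:
                    step[cls] += prev_p * trans_p * emit_p
        forward = dict(step)
    return sum(forward.values())


def best_class_sequence(words):
    """Return (probability, classes) of the single most likely class sequence."""
    best = {"<s>": (1.0, [])}  # best (probability, path) per most recent class
    for word in words:
        step = {}
        for prev_cls, (prev_p, path) in best.items():
            for cls, trans_p in TRANSITION[prev_cls].items():
                p = prev_p * trans_p * EMISSION[cls].get(word, 0.0)
                if p > 0.0 and p > step.get(cls, (0.0, None))[0]:
                    step[cls] = (p, path + [cls])
        best = step
    if not best:
        return 0.0, []
    return max(best.values(), key=lambda item: item[0])


if __name__ == "__main__":
    sentence = ["the", "walk"]
    print(sequence_probability(sentence))
    print(best_class_sequence(sentence))
```

Summing over class sequences corresponds to evaluating the model in a sum-based (real or log) semiring, while keeping only the best sequence corresponds to the tropical semiring commonly used in FST decoders. Which of the two the authors use is not stated in the abstract.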

Similar resources

Unified language modeling using finite-state transducers with first applications

In this paper, we investigate a weighted finite-state transducer approach to language modelling for speech recognition applications. We explore a unified framework to conversational speech recognition which combines the benefits of grammars, n-gram and class-based language models, with the flexibility of using dynamic data, and the potential for integrating semantics. Based on a virtual persona...

A Specialized WFST Approach for Class Models and Dynamic Vocabulary

In this paper we describe a specialized Weighted Finite State Transducer (WFST) framework for handling class language models and dynamic vocabulary in automatic speech recognition. The proposed framework has several important features, a fused composition algorithm that substantially reduces the memory usage in comparison to generic WFST operations, and an efficient dynamic vocabulary scheme th...

Dysarthric Speech Recognition Based on Error-Correction in a Weighted Finite State Transducer Framework

In this paper, a dysarthric speech recognition error-correction method in a weighted finite state transducer (WFST) framework is proposed to improve the performance of dysarthric automatic speech recognition (ASR). To this end, pronunciation variation models are constructed from a context-dependent confusion matrix based on a weighted Kullback-Leibler (KL) distance between triphones. Then, a WF...

Designing a Non-Finite-State Weighted Transducer Toolkit

Toolkits for weighted finite-state machines (WFSM’s) have proven to be tremendously useful in a wide variety of speech and language applications. While WFSM’s can directly represent finite-state statistical models such as hidden Markov models, this is not the case for many models of interest. In this paper, we consider extending a WFSM toolkit to a non-finite-state formalism. We select a formal...

Incremental Language Models for Speech Recognition Using Finite-state Transducers

In the context of the weighted finite-state transducer approach to speech recognition, we investigate a novel decoding strategy to deal with very large n-gram language models often used in large-vocabulary systems. In particular, we present an alternative to full, static expansion and optimization of the finite-state transducer network. This alternative is useful when the individual knowledge s...

Publication date: 2003